Abstract

A  reliability  trend/growth  analysis  methodology  for  satellite 
systems  is  suggested.  A  satellite  system  usually  consists  of 
many  satellites  successively  launched  over  many  years,  and 
its  satellites  typically  belong  to  different  satellite 
generations.  This  paper  suggests  an  approach  to  reliability 
trend/growth  data  analysis  for  the  satellite  systems  based  on 
grouped  data  and  the  Power  Law  (Crow-AMSAA)  Non- 
Homogeneous  Poisson  process  model,  for  both  one  (time) 
and  two  (time  and  generation)  variables.  Based  on  the  data 
specifics,  the  maximum  likelihood  estimates  for  the  Power 
Law  model  parameters  are  obtained.  In  addition,  the 
Cumulative  Intensity  Function  (CIF)  of  a  family  of  satellite 
systems  was  analyzed  to  assess  its  similarity  to  that  of  a 
repairable  system.  The  suggested  approaches  are  illustrated 
by  a  case  study  based  on  Tracking  and  Data  Relay  Satellite 
System  (TDRSS)  and  Geostationary  Operational 
Environmental  Satellite  (GOES)  data. 

1.  Introduction 

The objective of this study is to develop a reliability growth
analysis methodology applicable to satellite systems. A
satellite system usually consists of many satellites launched
successively over many years, and its satellites can belong to
different satellite generations. For example, the United States
National Environmental Satellite, Data, and Information Service
(NESDIS) is now developing its fourth generation (gen.) of the
GOES satellites. The first GOES satellite, GOES 1, was launched
in 1975 and the latest, GOES 15, was launched in 2010 (see
Table 2).

During the system life, its satellites can be in different
states, such as active, in-orbit testing, failed, standby, or
retired. Satellite system reliability improvements are based on
the analysis of anomalies (failures) observed on the in-orbit
satellites, and the respective corrective actions can usually be
implemented only in the next and following satellites to be
launched. In other words, the traditional reliability growth
"Test-Analyze-Fix" concept is not applicable to on-orbit
satellite systems, which makes the data model and data analysis
for satellite systems rather different.

2.  Data  and  Reliability  Growth  Model 

A satellite system (SS) is considered. Let us assume that the
SS currently consists of k satellites $S_1, S_2, \ldots, S_k$,
where $S_1$ is the first (oldest) successfully launched
satellite, $S_2$ is the second satellite, ..., and $S_k$ is the
latest successfully launched satellite. Let
$T_1, T_2, \ldots, T_k$ denote, respectively, the cumulative
times during which the anomalies of $S_1, S_2, \ldots, S_k$ were
recorded, and let $N_1, N_2, \ldots, N_k$ denote the random
numbers of corresponding failures (anomalies). These data can be
represented using Table 1.

We suggest applying the Crow-AMSAA model to the SS reliability
trend analysis. It is one of the most popular reliability growth
models and is the model used in Military Handbook 189
(MIL-HDBK-189C, 2011). The model was applied in the following
traditional form:

$$\lambda(t) = \lambda_0 \beta t^{\beta - 1} \qquad (1)$$

where $\lambda(t)$ is the rate of occurrence of failures (ROCOF)
for a given satellite, $\lambda_0$ and $\beta$ are positive
parameters, and t is the satellite order number, so that the
variable t takes on the values 1, 2, 3, 4, 5, .... Other choices
of the independent variable t can be budget or other monetary or
manpower resources spent to improve the satellite reliability.
It should be noted that in the case of reliability growth, the
parameter $\beta$ should satisfy the inequality $0 < \beta < 1$.
The model (1) is sometimes referred to as the Weibull process,
because it coincides with the failure rate of the Weibull
distribution.
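The correspondence with the Weibull distribution can be written out explicitly. The sketch below uses the standard two-parameter Weibull hazard with shape $\beta$ and scale $\eta$; the symbol $\eta$ and the cumulative function $\Lambda(t)$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{aligned}
h(t) &= \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}
      = \beta\,\eta^{-\beta}\,t^{\beta-1}, \\
\lambda_0 &:= \eta^{-\beta}
  \;\Longrightarrow\; h(t) = \lambda_0\,\beta\,t^{\beta-1} = \lambda(t), \\
\Lambda(t) &= \int_0^{t} \lambda(s)\,ds = \lambda_0\,t^{\beta}.
\end{aligned}
```

For $0 < \beta < 1$ the exponent $\beta - 1$ is negative, so $\lambda(t)$ decreases in t, which is the reliability growth case described above.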
Satellite    Time Interval    Number of Anomalies
S_1          T_1              N_1
S_2          T_2              N_2
...          ...              ...
S_k          T_k              N_k

Table 1. Satellite System Anomaly Data


3.  Data  Analysis 

For each satellite of the system considered, the ROCOF
estimate is calculated as

$$\hat{\lambda}(i) = \frac{N_i}{T_i}, \quad i = 1, 2, \ldots, k \qquad (2)$$

where the estimator (2) is known as the natural estimator
of ROCOF (Basu, A. P. & Rigdon, S. E., 2000; Crowder, M. J.,
Kimber, A. C., Smith, R. L., & Sweeting, T. J., 1991).

Assuming that ROCOF is constant (but different) for each
satellite, $N_i$ follows the Poisson distribution with mean
$\lambda(i) T_i$, where $\lambda(i)$ is the unknown true value of
ROCOF for the ith satellite. If the number of observed failures
$N_i$ is large enough, this distribution can be approximated by
the Normal distribution with the same mean and with variance
equal to this mean.
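As a minimal sketch of the estimator (2) and the normal approximation just described, the snippet below computes the natural ROCOF estimate and an approximate standard error for one satellite. The standard error sqrt(N)/T follows from treating the variance of N as equal to its mean; the function name, the 1.96 multiplier for an approximate 95% interval, and the choice of the TDRS A figures (192 records over 8431 days, taken from Table 2) are illustrative assumptions, not part of the original analysis.

```python
import math

def rocof_estimate(n_failures, interval_days):
    """Natural ROCOF estimate N/T and its approximate standard error.

    Under the Poisson assumption Var(N) = E(N), so Var(N/T) is estimated
    by N / T**2, giving a standard error of sqrt(N) / T.
    """
    lam_hat = n_failures / interval_days
    std_err = math.sqrt(n_failures) / interval_days
    return lam_hat, std_err

# Illustrative input: TDRS A counts from Table 2
lam_hat, std_err = rocof_estimate(192, 8431)

# Approximate 95% interval from the normal approximation (assumed multiplier 1.96)
low, high = lam_hat - 1.96 * std_err, lam_hat + 1.96 * std_err
print(f"ROCOF = {lam_hat:.4f} 1/day, approx. 95% interval [{low:.4f}, {high:.4f}]")
```

The approximation is only reasonable when N is large; for satellites with a handful of records an exact Poisson interval would be the safer choice.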

Based on the above considerations, the following regression
model (3) is suggested for estimating the parameters of the
Crow-AMSAA model (1):

$$\hat{\lambda}(t_i) = \lambda_0 \beta t_i^{\beta - 1} + \delta_i \qquad (3)$$

where $\delta_i$ is a normally distributed error with zero mean
and variance inversely proportional to the number of observed
failures $N_i$, and $t_i$ is the satellite order number, taking
on the values 1, 2, 3, .... Model (3) is a regression model that
is nonlinear in its parameters, where $\lambda_0$ and $\beta$
should be estimated under the restrictions $\lambda_0 > 0$ and
$0 < \beta < 1$.

Another way to estimate the parameters of the reliability
growth model (1) is to apply the Maximum Likelihood (ML)
approach. For the data discussed above, the likelihood function
$L(\lambda_0, \beta)$ can be written as

$$L(\lambda_0, \beta) = \prod_{i=1}^{k} \frac{\left(\int_0^{T_i} \lambda_0 \beta t_i^{\beta-1}\, d\tau\right)^{N_i} \exp\left(-\int_0^{T_i} \lambda_0 \beta t_i^{\beta-1}\, d\tau\right)}{N_i!} \qquad (4)$$

and its logarithm as

$$\ln L(\lambda_0, \beta) = \sum_{i=1}^{k} \left[ N_i \ln(\lambda_0) + N_i \ln(\beta) + N_i (\beta - 1) \ln(t_i) + N_i \ln(T_i) - \lambda_0 \beta t_i^{\beta-1} T_i - \ln(N_i!) \right] \qquad (5)$$

Writing the first derivatives of (5) with respect to $\lambda_0$
and $\beta$ and equating them to zero, we arrive at the following
system of non-linear equations for $\lambda_0$ and $\beta$:

$$\frac{1}{\lambda_0} \sum_{i=1}^{k} N_i - \beta \sum_{i=1}^{k} t_i^{\beta-1} T_i = 0 \qquad (6)$$

$$\frac{1}{\beta} \sum_{i=1}^{k} N_i + \sum_{i=1}^{k} N_i \ln(t_i) - \lambda_0 \sum_{i=1}^{k} t_i^{\beta-1} T_i \left(1 + \beta \ln(t_i)\right) = 0 \qquad (7)$$

which must be solved under the restrictions $\lambda_0 > 0$ and
$0 < \beta < 1$.

4. Case Study: Tracking and Data Relay Satellite System

The Tracking and Data Relay Satellite System (TDRSS) is a
network of satellites (each called a Tracking and Data Relay
Satellite or TDRS) and ground stations used for space
communications. The TDRSS space segment currently consists of
nine on-orbit TDRSs located in geosynchronous orbit, distributed
to provide global coverage.

The  available  data  on  the  TDRSs  are  327  NASA  Spacecraft 
Orbital  Anomaly  Report  System  (SOARS)  records  related 
to  the  satellites  of  the  first  TDRS  generation  (A,  C,  D,  E,  F 
and  G)  and  the  second  TDRS  generation  (H,  I,  J),  listed  in 
Table  2.  It  should  be  noted  that  there  is  much  less  data  on 
TDRS  H,  I  and  J  (only  about  25  cumulative  mission  years) 
compared  to  the  first  generation,  i.e.,  TDRS  A,  C,  D,  E,  F 
and  G  (about  101.4  cumulative  mission  years). 

The Crow-AMSAA model (1) and the data from Table 2 were used
for the reliability trend analysis. The parameters of the
reliability growth model were estimated as $\lambda_0 = 3.156917$
1/day and $\beta = 0.006$. The ROCOF estimates and the fitted
Crow-AMSAA model are shown in Figure 1 below. The model provides
a good fit to the data: the squared correlation coefficient is
$R^2 = 0.963$. Using the fitted model, the ROCOF for the future
TDRS M was predicted as 0.0015 1/day. The predicted value
indicates a 30% - 40% reliability growth for TDRS 13 (TDRS M)
compared to TDRS 10 (TDRS J) in terms of ROCOF.
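The quoted prediction can be checked by evaluating model (1) at the reported parameter values. The sketch below assumes that the independent variable t is the TDRS number (so TDRS M corresponds to t = 13 and TDRS J to t = 10); the function and variable names are illustrative.

```python
def power_law_rocof(t, lam0, beta):
    """Crow-AMSAA (power law) ROCOF, model (1): lam0 * beta * t**(beta - 1)."""
    return lam0 * beta * t ** (beta - 1)

LAM0 = 3.156917   # 1/day, as reported in the text
BETA = 0.006      # as reported in the text

predicted_13 = power_law_rocof(13, LAM0, BETA)   # projected ROCOF for TDRS M
fitted_10 = power_law_rocof(10, LAM0, BETA)      # fitted ROCOF for TDRS J
observed_10 = 6 / 2421                           # natural estimate for TDRS J (Table 2)

print(f"predicted ROCOF for t = 13: {predicted_13:.4f} 1/day")
print(f"reduction vs fitted t = 10:   {1 - predicted_13 / fitted_10:.1%}")
print(f"reduction vs observed t = 10: {1 - predicted_13 / observed_10:.1%}")
```

The two printed reductions differ because one compares against the fitted curve and the other against the raw Table 2 estimate; the text does not say which baseline underlies the quoted range.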
Gen.  Satellite  Other Satellite  Launch     Last Record  Time Interval,  Number of  ROCOF (λ_est),
      Name       Name             Date       Date         days            records    1/day
1     TDRS A     TDRS 1           4-Apr-83   5/04/2006    8431            192        0.0228
1     TDRS C     TDRS 3           29-Sep-88  9/28/2004    5843            28         0.0048
1     TDRS D     TDRS 4           13-Mar-89  11/02/2010   7904            35         0.0044
1     TDRS E     TDRS 5           2-Aug-91   11/06/2004   4845            21         0.0043
1     TDRS F     TDRS 6           13-Jan-93  7/16/2006    4932            20         0.0041
1     TDRS G     TDRS 7           13-Jul-95  6/4/2009     5075            8          0.0016
2     TDRS H     TDRS 8           30-Jun-00  9/02/2010    3716            5          0.0013
2     TDRS I     TDRS 9           8-Mar-02   9/26/2010    3124            12         0.0038
2     TDRS J     TDRS 10          4-Dec-02   7/21/2009    2421            6          0.0025

Table 2. Data and Estimated Rate of Occurrence of Failures
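The ROCOF column of Table 2 is the natural estimator (2) applied row by row, and the record total and cumulative mission years quoted in the text follow from the same columns. A short sketch of that bookkeeping is given below; the tuple layout and the 365.25 days-per-year conversion are assumptions made for illustration.

```python
# Table 2 rows: (satellite name, generation, time interval in days, number of records)
tdrs = [
    ("TDRS A", 1, 8431, 192),
    ("TDRS C", 1, 5843, 28),
    ("TDRS D", 1, 7904, 35),
    ("TDRS E", 1, 4845, 21),
    ("TDRS F", 1, 4932, 20),
    ("TDRS G", 1, 5075, 8),
    ("TDRS H", 2, 3716, 5),
    ("TDRS I", 2, 3124, 12),
    ("TDRS J", 2, 2421, 6),
]

# Natural ROCOF estimate N_i / T_i for every satellite
rocof = {name: n / days for name, _, days, n in tdrs}
total_records = sum(n for _, _, _, n in tdrs)

def mission_years(generation):
    """Cumulative mission years for one generation (365.25 days per year assumed)."""
    return sum(days for _, gen, days, _ in tdrs if gen == generation) / 365.25

for name, value in rocof.items():
    print(f"{name}: {value:.4f} 1/day")
print("records:", total_records)
print("mission years, gen. 1:", round(mission_years(1), 1))
print("mission years, gen. 2:", round(mission_years(2), 1))
```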


Figure 1. Estimated ROCOF and fitted reliability growth model;
vertical axis: TDRS ROCOF (λ), 1/day. The extreme right point is
the projected ROCOF for TDRS M (13).
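For readers who want to experiment with the ML route of Section 3, the log-likelihood (5) and the stationarity condition (6) can be evaluated directly on the Table 2 data. The sketch below is an illustration under stated assumptions: t_i is taken to be the TDRS number, and the function names are hypothetical. It does not attempt to reproduce the reported estimates, since the text does not state which estimation route produced them.

```python
import math

# (t_i, T_i in days, N_i) from Table 2; t_i is assumed to be the TDRS number
data = [(1, 8431, 192), (3, 5843, 28), (4, 7904, 35), (5, 4845, 21),
        (6, 4932, 20), (7, 5075, 8), (8, 3716, 5), (9, 3124, 12), (10, 2421, 6)]

def log_likelihood(lam0, beta, rows):
    """Log-likelihood (5): Poisson counts with mean lam0*beta*t**(beta-1)*T."""
    total = 0.0
    for t, interval, n in rows:
        mean = lam0 * beta * t ** (beta - 1) * interval
        # n*ln(mean) expands to the first four terms of (5); lgamma(n+1) = ln(n!)
        total += n * math.log(mean) - mean - math.lgamma(n + 1)
    return total

def lam0_given_beta(beta, rows):
    """Solve equation (6) for lam0 at a fixed beta."""
    total_n = sum(n for _, _, n in rows)
    weighted_time = sum(t ** (beta - 1) * interval for t, interval, _ in rows)
    return total_n / (beta * weighted_time)

beta = 0.006                          # value reported in the text, used as an example
lam0 = lam0_given_beta(beta, data)
print(f"lam0 solving (6) at beta = {beta}: {lam0:.4f} 1/day")
print(f"log-likelihood there: {log_likelihood(lam0, beta, data):.2f}")
```

A full ML fit would additionally search over beta within (0, 1), for example by maximizing `log_likelihood(lam0_given_beta(b, data), b, data)` over a grid of b values.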
Figure  2.  GOES  ROCOF  dependence  on  satellite
operational  number.  The  interval  [4,  7]  is  the  first  generation 
of  satellites;  the  interval  [8,  12]  is  the  second  generation  of 
GOES  satellites. 


5.  Reliability  Growth  Model  with  Two  Variables 

Earlier,  we  applied  the  power  law  (Crow-AMSAA) 
relationship  to  model  satellite  ROCOF  dependence  on  the 
satellite  order  (operational)  number.  The  relationship  we 
are  going  to  introduce  below  can  be  used  to  take  into 
account  a  possible  jump  of  ROCOF  attributed  to  a  new 
satellite  generation,  which  is  illustrated  by  the  GOES 
ROCOF  (see  Figure  2  and  Table  3). 


Figure 2 shows a significant jump in ROCOF between the first
generation and the second generation of GOES. This increase in
ROCOF of the second generation can be explained by more complex
satellite design and functions. The figure also reveals a minor
ROCOF increase for the last satellite of each generation. The
GOES 7 increase in ROCOF compared to its predecessor GOES 6 can
be attributed to a new feature of GOES 7: it was the first GOES
satellite capable of detecting 406 MHz distress signals from
emergency beacons carried aboard aircraft and vessels and
sending them to ground stations. In turn, the GOES 12 increase
in ROCOF (compared to GOES 11) can be attributed to a new
instrument: GOES 12 was the first satellite to carry a Solar
X-Ray Imager (SXI) type instrument.

In order to take into account a ROCOF dependence on the
satellite generation, the following model is suggested:

$$\lambda(t_i, T_j) = \lambda_0 t_i^{\beta_1} T_j^{\beta_2} \qquad (8)$$

The model has the following two independent variables: the
operational number ($t_i = 4, 5, \ldots, 12$) and a dummy
variable $T_j$ ($j = 1, 2, \ldots, J$), which encodes the
satellite generation order number. The dummy variable takes the
value e for the first generation, i.e., $T_1 = e$, and the value
$e^0 = 1$ for the second satellite generation, i.e., $T_2 = 1$.
The choice of these values becomes obvious if we take the
natural logarithm of (8) in order to make the model linear:

$$\ln \lambda(t_i, T_j) = \ln(\lambda_0) + \beta_1 \ln(t_i) + \beta_2 \ln(T_j) \qquad (8.1)$$

It is clear that the transition from the first generation to the
second generation shifts the intercept of the above linear
dependence by $-\beta_2$ (that is, by $\beta_2$ in magnitude),
because of a unit change in $\ln(T_j)$, i.e.,
$\ln(T_1) - \ln(T_2) = 1$. The variable $T_j$ can be called the
generation code. The available GOES ROCOF data needed to fit the
above model are given in Table 3.
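To make the role of the generation code explicit, (8.1) can be written out separately for each generation. This is only a restatement of the model under the coding $T_1 = e$, $T_2 = 1$ described above, not an additional result.

```latex
\begin{aligned}
\text{Generation 1 } (\ln T_1 = 1):\quad
  & \ln\lambda(t_i, T_1) = \bigl(\ln\lambda_0 + \beta_2\bigr) + \beta_1 \ln t_i, \\
\text{Generation 2 } (\ln T_2 = 0):\quad
  & \ln\lambda(t_i, T_2) = \ln\lambda_0 + \beta_1 \ln t_i .
\end{aligned}
```

In log-log coordinates the two generations therefore lie on parallel lines with common slope $\beta_1$ and vertical offset $\beta_2$; a negative $\beta_2$ corresponds to a higher second-generation ROCOF at the same operational number.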


GOES   Gen. Code   ln(T)   GOES Oper.    ROCOF (λ_est),
Gen.   (T)                 Number (t)    1/day
1      EXP(1)      1       4             0.00956
1      EXP(1)      1       5             0.00667
1      EXP(1)      1       6             0.00435
1      EXP(1)      1       7             0.00828
2      EXP(0)      0       8             0.04848
2      EXP(0)      0       9             0.02940
2      EXP(0)      0       10            0.01445
2      EXP(0)      0       11            0.00638
2      EXP(0)      0       12            0.01239

Table 3. GOES History and Estimated Rate of Occurrence
of Failures (ROCOF)


The parameter estimates of model (8.1) obtained from the above
data are given in Table 4.


Parameter                  Estimate   Std. Err.   t(6)      p-level
ln(λ_0)                    0.4017     2.3920      0.1679    0.8722
β_2 (exponent of T)        -2.1074    0.7308      -2.8839   0.0279
β_1 (exponent of t)        -1.9407    1.0380      -1.8697   0.1107

Table 4. Regression analysis summary of model (8.1)
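The estimates in Table 4 can be approximately reproduced by ordinary least squares on the linearized model (8.1), using the Table 3 data. The sketch below is one possible implementation, not the authors' procedure: it solves the normal equations with a small hand-written Gaussian elimination so that no external library is needed, and all function and variable names are illustrative. Small differences from Table 4 are expected because the Table 3 ROCOF values are rounded.

```python
import math

# Table 3 rows: (generation, operational number t, ROCOF in 1/day)
goes = [
    (1, 4, 0.00956), (1, 5, 0.00667), (1, 6, 0.00435), (1, 7, 0.00828),
    (2, 8, 0.04848), (2, 9, 0.02940), (2, 10, 0.01445), (2, 11, 0.00638),
    (2, 12, 0.01239),
]

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# Design rows [1, ln t, ln T]; ln T is 1 for generation 1 and 0 for generation 2
X = [[1.0, math.log(t), 1.0 if gen == 1 else 0.0] for gen, t, _ in goes]
y = [math.log(value) for _, _, value in goes]

# Normal equations (X'X) b = X'y
xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(3)]
ln_lambda0, coef_ln_t, coef_ln_T = solve(xtx, xty)

print(f"ln(lambda_0)      = {ln_lambda0:.4f}")
print(f"coefficient ln(t) = {coef_ln_t:.4f}")
print(f"coefficient ln(T) = {coef_ln_T:.4f}")
```

In this refit the coefficient of ln(T) comes out near -2.11 and the coefficient of ln(t) near -1.94, which indicates how the two slope rows of Table 4 map onto the variables of model (8.1).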


As follows from Table 4, the parameter $\ln(\lambda_0)$ is
statistically insignificant, so our model (8) can be written as:

$$\lambda(t_i, T_j) = t_i^{\beta_1} T_j^{\beta_2} \qquad (8.2)$$

The  fitted  model  is  shown  in  Figure  3. 

Figure 3. The GOES ROCOF and fitted model (8.2); axes: GOES
ROCOF (λ), 1/day, versus operational number.


Based on the model and data, the jump in values of ROCOF between
the generations, $\lambda(8, T_2) / \lambda(7, T_1)$, is about
[value illegible in the source].
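The ratio can be recomputed from model (8.2) once the exponents are fixed. The sketch below assumes that the Table 4 estimate -1.9407 is the exponent of t and -2.1074 the exponent of T, which is the assignment supported by an ordinary least-squares refit of the Table 3 data. The result is a recomputation under that assumption, not the figure printed in the original.

```python
import math

BETA_T_NUMBER = -1.9407   # assumed exponent of the operational number t
BETA_GEN_CODE = -2.1074   # assumed exponent of the generation code T

def rocof_model(t, ln_gen_code):
    """Model (8.2): t**beta_1 * T**beta_2, with T supplied through ln(T)."""
    return t ** BETA_T_NUMBER * math.exp(BETA_GEN_CODE * ln_gen_code)

# GOES 8 is the first second-generation satellite (ln T = 0);
# GOES 7 is the last first-generation satellite (ln T = 1).
jump = rocof_model(8, 0.0) / rocof_model(7, 1.0)
print(f"lambda(8, T2) / lambda(7, T1) = {jump:.2f}")
```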


In order to compare the reliability growth rate for GOES
generations 1 and 2, the following ROCOF model was fitted
separately for each generation:

$$\lambda(t_i) = \lambda_0 t_i^{\beta} \qquad (8.3)$$


The  fitted  models  are  shown  in  Figures  4  and  5. 
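The per-generation fits can be approximated by a least-squares line in log-log coordinates, applied to each generation's rows of Table 3. The helper below is a sketch under that assumption (the fitting procedure is not described in the text); its name and return convention are illustrative.

```python
import math

def fit_power_law(points):
    """Least-squares fit of ln(rocof) = ln(a) + b * ln(t); returns (a, b)."""
    xs = [math.log(t) for t, _ in points]
    ys = [math.log(value) for _, value in points]
    n = len(points)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return math.exp(y_mean - slope * x_mean), slope

# (operational number t, ROCOF in 1/day) from Table 3
generation_1 = [(4, 0.00956), (5, 0.00667), (6, 0.00435), (7, 0.00828)]
generation_2 = [(8, 0.04848), (9, 0.02940), (10, 0.01445), (11, 0.00638), (12, 0.01239)]

a1, b1 = fit_power_law(generation_1)
a2, b2 = fit_power_law(generation_2)
print(f"generation 1: lambda = {a1:.4f} * t^({b1:.3f})")
print(f"generation 2: lambda = {a2:.1f} * t^({b2:.3f})")
```

The much steeper exponent for generation 2 is what the conclusion refers to as more pronounced reliability growth among the newer satellites; with only four and five points per fit, both exponents carry considerable uncertainty.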


Figure 4. GOES Generation 1 ROCOF (λ), 1/day, versus operational
number. Fitted model: $\lambda = 0.01761\, t^{-0.553}$.

Figure 5. GOES Generation 2 ROCOF (λ), 1/day, versus operational
number. Fitted model: $\lambda = 331.39\, t^{-4.297}$.

6. Failure Time Occurrences during Each Mission

Based on its cumulative intensity function (CIF), each satellite
in a system of satellites, such as GOES, can be considered as a
repairable system. The cumulative intensity function of an
idealized repairable system is depicted in Figure 6. At the
beginning of the mission, the CIF is concave down, i.e., it has
a decreasing derivative (ROCOF). This part of the system mission
lifetime corresponds to reliability growth. Then the CIF becomes
approximately linear, which corresponds to a ROCOF that is
constant in time and to normal (from a reliability standpoint)
system operation. At the end of system life, the CIF becomes
concave up, corresponding to an increasing ROCOF, and this part
of the system mission lifetime corresponds to reliability
deterioration (aging).

Figure 6. Cumulative Intensity Function of Idealized
Repairable System

Figures 7a through 7i display the real CIF for a variety of
GOES missions. These cumulative intensity functions have
shapes similar to the idealized CIF.

Figures 7a through 7i. CIF versus days after launch for
individual GOES satellites; the surviving panel titles are
GOES 5 CIF, GOES 6 CIF, GOES 7 CIF, and GOES 10 CIF.
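The three CIF phases described in this section can be checked numerically by comparing the ROCOF in successive windows of a mission. The sketch below uses entirely hypothetical anomaly times (no per-event data are given in the text) and a simple three-window split; both the data and the function name are assumptions made for illustration.

```python
def windowed_rocof(event_days, mission_days, n_windows=3):
    """ROCOF (events per day) in n equal-width windows of the mission."""
    width = mission_days / n_windows
    counts = [0] * n_windows
    for day in event_days:
        index = min(int(day // width), n_windows - 1)
        counts[index] += 1
    return [count / width for count in counts]

# Hypothetical anomaly times, in days after launch
events = [5, 12, 20, 33, 50, 80, 130, 400, 700,
          1000, 1300, 1600, 1900,
          2200, 2500, 2650, 2750, 2850, 2900, 2950, 2990]
mission = 3000

# Empirical CIF: cumulative number of anomalies at each event time
cif = [(day, k + 1) for k, day in enumerate(sorted(events))]

early, middle, late = windowed_rocof(events, mission)
print(f"ROCOF early/middle/late: {early:.4f}, {middle:.4f}, {late:.4f} 1/day")
print("final CIF point:", cif[-1])
```

An early ROCOF above the middle value corresponds to the concave-down start of the CIF, and a late ROCOF above the middle value corresponds to the concave-up end.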
7.  Conclusion 

In this study we set out to suggest a reliability trend/growth
analysis methodology for satellite systems. We used the number
of recorded anomalies for a given satellite mission over a given
period of operation as the data to measure this growth. Using
these data, we modeled reliability growth both as a function of
time and as a function of time and satellite generation.
Finally, we observed trends in ROCOF over a single satellite's
operational lifetime and discussed the implications of all the
observed trends for the evolution of satellite systems. In order
to do this, we assumed that an anomaly entry in the SOARS
database corresponded to a failure of the given satellite, since
anomalies are positively correlated with failures. Such an
assumption is reasonable as long as the model used to fit the
data is not expected to predict or measure the number of actual
failures of a given satellite system.

We can model reliability growth across multiple satellite
generations in a satellite system with a Crow-AMSAA model. For
the TDRS data this model fits well, with a squared correlation
coefficient close to one ($R^2 = 0.963$). The fitted model
indicates a 30% - 40% reliability growth for the TDRS 13
(TDRS M) satellite compared to TDRS 10 (TDRS J), in terms of
ROCOF. The overall decrease of ROCOF with time implies an
improving level of reliability and thus reliability growth in
the TDRS family of spacecraft. These results are intuitive,
since each satellite generation is relatively similar in design
to the previous one, allowing consecutive generation designs to
be more refined.

We can model reliability growth across multiple satellite
generations in a satellite system with greater accuracy after a
slight modification to the Crow-AMSAA model. This modification
involves introducing a dummy variable $T_j$
($j = 1, 2, \ldots, J$), which represents the satellite
generation order number. It allows the model to capture any
major generational changes in satellite system ROCOF data due to
new technologies. In the model fitted to GOES satellite data,
reliability growth is more pronounced among the newer generation
of GOES satellites, and the model is able to capture and explain
the radical change in ROCOF data corresponding to a significant
change in technology introduced by the second generation of GOES
satellites, beginning with GOES 8. This model provides a better
fit than would have been possible with the single-variable
Crow-AMSAA model due to its ability to capture the inflection
introduced by GOES 8.

We examined whether satellite systems, such as GOES, can
plausibly be treated as repairable systems. Such systems
experience a rapid increase in the reported number of failures
over an initial period of operation, then maintain a fixed, less
sharply increasing rate of failures for an extended period of
operation, until finally the rate of failures increases again
towards the end of system life (i.e., the bathtub curve effect).
This turned out to be plausible, since the observed CIF of each
of the GOES family systems displayed some, if not all, of these
repairable system characteristics.

We can improve the current models by introducing a Bayesian
prior distribution over their parameters (i.e., $\lambda_0$,
$\beta$), considering them as random variables, and employing
Bayesian inference, as opposed to classical Maximum Likelihood
Estimation. These considerations should be addressed in future
studies of these data.